Inspired by ideas of Chomsky, Bar-Hillel, Ginsburg, and their coworkers, I spent the summer of 
1964 drafting Chapter 11 of a book I had been asked to write. The main purpose of that book, 
tentatively entitled The Art of Computer Programming, was to explain how to write compilers; 
compilation was to be the subject of the twelfth and final chapter. Chapter 10 was called "Parsing," 
and Chapter 11 was "The theory of languages." I wrote the drafts of these chapters in the order 
11, 10, 12, because Chapter 11 was the most fun to do. 
Terminology and notation for formal linguistics were in a great state of flux in the early 60s, 
so it was natural for me to experiment with new ways to define the notion of what was then being 
called a "Chomsky type 2" or "ALGOL-like" or "definable" or "phrase structure" or "context-free" 
language. As I wrote Chapter 11, I made two changes to the definitions that had been appearing 
in the literature. The first of these was comparatively trivial, although it simplified the statements 
and proofs of quite a few theorems: I replaced the "starting symbol" S by a "starting set" of 
strings from which the language was derived. The second change was more substantial: I decided 
to keep track of the multiplicity of strings in the language, so that a string would appear several 
times if there were several ways to parse it. This second change was natural from a programmer's 
viewpoint, because transformations on context-free grammars had proved to be most interesting in 
practice when they yielded isomorphisms between parse trees. 

I never discussed these ideas in journal articles at the time, because I thought my book would 
soon be ready for publication. (I published an article about LR(k) grammars [4] only because 
it was an idea that occurred to me after finishing the draft of Chapter 10; the whole concept of 
LR(k) was well beyond the scope of my book, as envisioned in 1964.) My paper on parenthesis 
grammars [5] did make use of starting sets, but in my other relevant papers [4, 6, 8] I stuck with 
the more conventional use of a starting symbol S. I hinted at the importance of multiplicity in the 
answer to exercise 4.6.3-19 of The Art of Computer Programming (written in 1967, published in 
1969 [7]): "The terminal strings of a noncircular context-free grammar form a multiset which is a 
set if and only if the grammar is unambiguous." But as the years went by and computer science 
continued its explosive growth, I found it more and more difficult to complete final drafts of the 
early chapters, and the date for the publication of Chapter 11 kept advancing faster than the clock 
was ticking. 

Some of the early literature of context-free grammars referred to "strong equivalence," which 
meant that the multiplicities 0, 1, and $\ge 2$ were preserved; if $\mathcal{G}_1$ was strongly equivalent to $\mathcal{G}_2$, then 
$\mathcal{G}_1$ was ambiguous iff $\mathcal{G}_2$ was ambiguous. But this concept did not become prominent enough to 
deserve mention in the standard textbook on the subject [1]. 

The occasion of Seymour Ginsburg's 64th birthday has reminded me that the simple ideas 
I played with in '64 ought to be aired before too many more years go by. Therefore I would 
like to sketch here the basic principles I plan to expound in Chapter 11 of The Art of Computer 
Programming when it is finally completed and published — currently scheduled for the year 2008. 
My treatment will be largely informal, but I trust that interested readers will see easily how to 
make everything rigorous. If these ideas have any merit they may lead some readers to discover new 
results that will cause further delays in the publication of Chapter 11. That is a risk I'm willing to 
take. 



1. Multisets. A multiset is like a set, but its elements can appear more than once. An element 
can in fact appear infinitely often, in an infinite multiset. The multiset containing 3 a's and 2 b's 
can be written in various ways, such as $\{a, a, a, b, b\}$, $\{a, a, b, a, b\}$, or $\{3 \cdot a, 2 \cdot b\}$. If $A$ is a multiset 
of objects and if $x$ is an object, $[x]A$ denotes the number of times $x$ occurs in $A$; this is either a 
nonnegative integer or $\infty$. We have $A \subseteq B$ when $[x]A \le [x]B$ for all $x$; thus $A = B$ if and only if 
$A \subseteq B$ and $B \subseteq A$. A multiset is a set if no element occurs more than once, i.e., if $[x]A \le 1$ for 
all $x$. If $A$ and $B$ are multisets, we define $A^{\cap}$, $A \cup B$, $A \cap B$, $A \uplus B$, and $A \Cap B$ by the rules 

$$\begin{aligned}
[x]\,A^{\cap} &= \min(1, [x]A)\,; \\
[x]\,(A \cup B) &= \max([x]A, [x]B)\,; \\
[x]\,(A \cap B) &= \min([x]A, [x]B)\,; \\
[x]\,(A \uplus B) &= ([x]A) + ([x]B)\,; \\
[x]\,(A \Cap B) &= ([x]A) \cdot ([x]B)\,.
\end{aligned}$$

(We assume here that $\infty$ plus anything is $\infty$ and that 0 times anything is 0.) Two multisets $A$ 
and $B$ are similar, written $A \asymp B$, if $A^{\cap} = B^{\cap}$; this means they would agree as sets, if multiplicities 
were ignored. Notice that $A \cup B \asymp A \uplus B$ and $A \cap B \asymp A \Cap B$. All four binary operations are 
associative and commutative; several distributive laws also hold, e.g., 

$$(A \cap B) \Cap C = (A \Cap C) \cap (B \Cap C)\,.$$
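
The five operations above can be tried out on small finite multisets. The following is a minimal Python sketch, assuming finite multiplicities only (the value $\infty$ is not modeled) and using the standard `collections.Counter` as the multiset representation; the function names are illustrative and do not come from the text.

```python
from collections import Counter

def as_set(a):
    """Set reduction: every element present is kept exactly once."""
    return Counter({x: 1 for x, k in a.items() if k > 0})

def union(a, b):
    """Multiplicity is the maximum of the two multiplicities."""
    return a | b

def intersection(a, b):
    """Multiplicity is the minimum of the two multiplicities."""
    return a & b

def msum(a, b):
    """Multiset sum: multiplicities are added."""
    return a + b

def mprod(a, b):
    """Multiplicities are multiplied; elements with product 0 are dropped."""
    return Counter({x: a[x] * b[x] for x in a if a[x] * b[x] > 0})

def similar(a, b):
    """Two multisets are similar if they agree once multiplicities are ignored."""
    return as_set(a) == as_set(b)

A = Counter("aaabb")   # {3·a, 2·b}
B = Counter("abbbc")   # {a, 3·b, c}
C = Counter("aab")     # {2·a, b}

print(union(A, B), msum(A, B))
print(intersection(A, B), mprod(A, B))
```

On this data the union and the sum are similar, the intersection and the product are similar, and the distributive law stated above holds.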

Multiplicities are taken into account when multisets appear as index sets (or rather as "index 
multisets"). For example, if $A = \{2, 2, 3, 5, 5, 5\}$, we have 

$$\begin{aligned}
\{\, x - 1 \mid x \in A \,\} &= \{1, 1, 2, 4, 4, 4\}\,; \\
\sum_{x \in A} (x - 1) &= \sum \{\, x - 1 \mid x \in A \,\} = 16\,; \\
\biguplus_{x \in A} B_x &= B_2 \uplus B_2 \uplus B_3 \uplus B_5 \uplus B_5 \uplus B_5\,.
\end{aligned}$$

If $P(n)$ is the multiset of prime factors of $n$, we have $\prod \{\, p \mid p \in P(n) \,\} = n$ for all positive 
integers $n$. 

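If $A$ and $B$ are multisets, we also write 

$$\begin{aligned}
A + B &= \{\, a + b \mid a \in A,\ b \in B \,\}\,, \\
AB &= \{\, ab \mid a \in A,\ b \in B \,\}\,;
\end{aligned}$$

therefore if $A$ has $m$ elements and $B$ has $n$ elements, both multisets $A + B$ and $AB$ have $mn$ elements. 
Notice that 

$$[x](A + B) = \sum_{a \in A} [x - a]B = \sum_{b \in B} [x - b]A = \sum_{a \in A} \sum_{b \in B} [x = a + b]\,,$$

where $[x = a + b]$ is 1 if $x = a + b$ and 0 otherwise. Similar formulas hold for $[x](AB)$. 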
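
As a small check of these definitions, here is a Python sketch (finite multisets of integers only; the helper name `pairwise` is an assumption made for this illustration) that forms the multisets of all pairwise sums and products and compares the multiplicities with the first summation formula.

```python
from collections import Counter

def pairwise(a, b, op):
    """Multiset {op(x, y) | x in A, y in B}, with multiplicities multiplied."""
    out = Counter()
    for x, i in a.items():
        for y, j in b.items():
            out[op(x, y)] += i * j
    return out

A = Counter([2, 2, 3])
B = Counter([1, 1, 5])

S = pairwise(A, B, lambda x, y: x + y)   # the multiset A + B
P = pairwise(A, B, lambda x, y: x * y)   # the multiset AB

# Both have (number of elements of A) * (number of elements of B) elements.
print(sum(S.values()), sum(P.values()))

# The multiplicity of x in A + B equals the sum over a in A of the multiplicity of x - a in B.
print(all(S[x] == sum(B[x - a] for a in A.elements()) for x in S))
```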
It is convenient to let $Ab$ stand for the multiset 

$$Ab = \{\, ab \mid a \in A \,\} = A\{b\}\,;$$

similarly, $aB$ stands for $\{a\}B$. This means, for example, that $2A$ is not the same as $A + A$; a special 
notation, perhaps $n * A$, is needed for the multiset 

$$\underbrace{A + \cdots + A}_{n \text{ times}} = \{\, a_1 + \cdots + a_n \mid a_j \in A \text{ for } 1 \le j \le n \,\}\,.$$

Similarly we need notations to distinguish the multiset 

$$AA = \{\, aa' \mid a, a' \in A \,\}$$

from the quite different multiset 

$$\{\, a^2 \mid a \in A \,\} = \{\, aa \mid a \in A \,\}\,.$$

The product 

$$\underbrace{A \ldots A}_{n \text{ times}} = \{\, a_1 \ldots a_n \mid a_j \in A \text{ for } 1 \le j \le n \,\}$$

is traditionally written $A^n$, and I propose writing 

$$A{\uparrow}n = \{\, a^n \mid a \in A \,\} = \{\, a{\uparrow}n \mid a \in A \,\}$$

on the rarer occasions when we need to deal with multisets of $n$th powers. 
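
The difference between the $n$-fold product of a multiset with itself and the multiset of $n$th powers of its elements is easy to see on a tiny example. The sketch below is illustrative only; it assumes the elements are Python strings combined by concatenation, and the function names are not taken from the text.

```python
from collections import Counter
from itertools import product

def power(a, n):
    """n-fold product: concatenations of n independently chosen elements of A."""
    out = Counter()
    for combo in product(a.elements(), repeat=n):
        out["".join(combo)] += 1
    return out

def elementwise_power(a, n):
    """Multiset of n-th powers: each element concatenated with itself n times."""
    out = Counter()
    for x, k in a.items():
        out[x * n] += k
    return out

A = Counter(["a", "b", "b"])

print(power(A, 2))              # 9 elements: aa, 2·ab, 2·ba, 4·bb
print(elementwise_power(A, 2))  # 3 elements: aa, 2·bb
```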

Multilanguages. A multilanguage is like a language, but its elements can appear more than once. 
Thus, if we regard a language as a set of strings, a multilanguage is a multiset of strings. 

An alphabet is a finite set of distinguishable characters. If $U$ is an alphabet, $U^*$ denotes the 
set of all strings over $U$. Strings are generally represented by lowercase Greek letters; the empty 
string is called $\epsilon$. If $A$ is any multilanguage, we write 

$$\begin{aligned}
A^0 &= \{\epsilon\}\,, \\
A^* &= A^0 \uplus A^1 \uplus A^2 \uplus \cdots = \biguplus_{n \ge 0} A^n\,;
\end{aligned}$$

this will be a language (i.e., a set) if and only if the string equation $\alpha_1 \ldots \alpha_m = \alpha'_1 \ldots \alpha'_{m'}$ for 
$\alpha_1, \ldots, \alpha_m, \alpha'_1, \ldots, \alpha'_{m'} \in A$ implies that $m = m'$ and that $\alpha_k = \alpha'_k$ for $1 \le k \le m$. If $\epsilon \notin A$, every 
element of $A^*$ has finite multiplicity; otherwise every element of $A^*$ has infinite multiplicity. 
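
The multiplicity of a string in the star of a multilanguage counts its factorizations into elements of that multilanguage. A minimal Python sketch, assuming the empty string is not an element (so that all multiplicities are finite) and looking only at strings up to a chosen length; the function name is hypothetical.

```python
from collections import Counter

def star_multiplicities(a, max_len):
    """Multiplicity in A* of every string of length <= max_len.

    Assumes the empty string is not in A; otherwise every multiplicity
    would be infinite and this loop would not terminate.
    """
    assert "" not in a, "empty string in A makes all multiplicities infinite"
    counts = Counter({"": 1})      # contribution of A^0
    frontier = Counter({"": 1})    # strings of A^n that are still short enough
    while frontier:
        nxt = Counter()
        for s, k in frontier.items():
            for w, j in a.items():
                t = s + w
                if len(t) <= max_len:
                    nxt[t] += k * j
        counts += nxt
        frontier = nxt
    return counts

A = Counter(["a", "b", "ab"])
m = star_multiplicities(A, 4)
print(m["ab"], m["abab"])   # "ab" factors as a·b or ab, so it appears twice
```

With the set {a, b} in place of {a, b, ab}, every string has a unique factorization and the star is an ordinary language.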

A context-free grammar $\mathcal{G}$ has four component parts $(T, N, S, \mathcal{P})$: $T$ is an alphabet of terminals; 
$N$ is an alphabet of nonterminals, disjoint from $T$; $S$ is a finite multiset of starting strings over the 
alphabet $V = T \cup N$; and $\mathcal{P}$ is a finite multiset of productions, where each production has the form 

$$A \to \theta\,, \qquad \text{for some } A \in N \text{ and } \theta \in V^*.$$

We usually use lowercase letters to represent elements of $T$, uppercase letters to represent elements 
of $N$. The starting strings and the right-hand sides of all productions are called the basic strings 
of $\mathcal{G}$. The multiset $\{\, \theta \mid A \to \theta \in \mathcal{P} \,\}$ is denoted by $\mathcal{P}(A)$; thus we can regard $\mathcal{P}$ as a mapping 
from $N$ to multisets of strings over $V$. 

The productions are extended to relations between strings in the usual way. Namely, if $A \to \theta$ 
is in $\mathcal{P}$, we say that $\alpha A \omega$ produces $\alpha \theta \omega$ for all strings $\alpha$ and $\omega$ in $V^*$; in symbols, $\alpha A \omega \to \alpha \theta \omega$. We 
also write $\sigma \to^n \tau$ if $\sigma$ produces $\tau$ in $n$ steps; this means that there are strings $\sigma_0, \sigma_1, \ldots, \sigma_n$ in $V^*$ 
such that $\sigma_0 = \sigma$, $\sigma_{j-1} \to \sigma_j$ for $1 \le j \le n$, and $\sigma_n = \tau$. Furthermore we write $\sigma \to^* \tau$ if $\sigma \to^n \tau$ 
for some $n \ge 0$, and $\sigma \to^+ \tau$ if $\sigma \to^n \tau$ for some $n \ge 1$. 

A parse $\Pi$ for $\mathcal{G}$ is an ordered forest in which each node is labeled with a symbol of $V$; each 
internal (non-leaf) node is also labeled with a production of $\mathcal{P}$. An internal node whose production 
label is $A \to v_1 \ldots v_l$ must be labeled with the symbol $A$, and it must have exactly $l$ children labeled 
$v_1, \ldots, v_l$, respectively. If the labels of the root nodes form the string $\sigma$ and the labels of the leaf 
nodes form the string $\tau$, and if there are $n$ internal nodes, we say that $\Pi$ parses $\tau$ as $\sigma$ in $n$ steps. 
There is an $n$-step parse of $\tau$ as $\sigma$ if and only if $\sigma \to^n \tau$. 

In many applications, we are interested in the number of parses; so we let $L(\sigma)$ be the multiset 
of all strings $\tau \in T^*$ such that $\sigma \to^* \tau$, with each $\tau$ occurring exactly as often as there are parses 
of $\tau$ as $\sigma$. This defines a multilanguage $L(\sigma)$ for each $\sigma \in V^*$. 

It is not difficult to see that the multilanguages $L(\sigma)$ are characterized by the following multiset 
equations: 

$$\begin{aligned}
L(\tau) &= \{\tau\}\,, && \text{for all } \tau \in T^*; \\
L(A) &= \biguplus \{\, L(\theta) \mid \theta \in \mathcal{P}(A) \,\}\,, && \text{for all } A \in N; \\
L(\sigma \sigma') &= L(\sigma)\, L(\sigma')\,, && \text{for all } \sigma, \sigma' \in V^*.
\end{aligned}$$

According to the conventions outlined above, the stated formula for $L(A)$ takes account of 
multiplicities, if any productions $A \to \theta$ are repeated in $\mathcal{P}$. Parse trees that use different copies of 
the same production are considered different; we can, for example, assign a unique number to each 
production, and use that number as the production label on internal nodes of the parse. 

Notice that the multiplicity of $\tau$ in $L(\sigma)$ is the number of parses of $\tau$ as $\sigma$, not the number 
of derivations $\sigma = \sigma_0 \to \cdots \to \sigma_n = \tau$. For example, if $\mathcal{P}$ contains just two productions $\{A \to a,$ 
$B \to b\}$, then $L(AB) = \{ab\}$ corresponds to the unique parse 

$$\begin{array}{cc} A & B \\ | & | \\ a & b \end{array}$$

although there are two derivations, $AB \to Ab \to ab$ and $AB \to aB \to ab$. 

The multilanguages $L(\sigma)$ depend only on the alphabets $T \cup N$ and the productions $\mathcal{P}$. The 
multilanguage defined by $\mathcal{G}$, denoted by $L(\mathcal{G})$, is the multiset of strings parsable from the starting 
strings $S$, counting multiplicity: 

$$L(\mathcal{G}) = \biguplus \{\, L(\sigma) \mid \sigma \in S \,\}\,.$$
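
The three characterizing equations translate almost directly into a recursive parse counter. The following Python sketch is an illustration under stated assumptions, not a general algorithm: symbols are single characters, nonterminals are exactly the keys of `PRODUCTIONS`, and the grammar has no empty right-hand sides and no productions whose right-hand side is a single nonterminal, so the recursion always terminates. The grammar S → SS | a is a hypothetical example chosen because it is ambiguous.

```python
from collections import Counter
from functools import lru_cache

# Hypothetical grammar: S -> SS | a, with the single starting string S.
PRODUCTIONS = {"S": Counter({"SS": 1, "a": 1})}
STARTS = Counter({"S": 1})

@lru_cache(maxsize=None)
def parses(sigma, tau):
    """Number of parses of the terminal string tau as sigma."""
    if sigma == "":
        return 1 if tau == "" else 0
    if len(sigma) > len(tau):        # each symbol yields at least one character
        return 0
    head, rest = sigma[0], sigma[1:]
    if rest:                         # concatenation rule: split tau in all ways
        return sum(parses(head, tau[:i]) * parses(rest, tau[i:])
                   for i in range(1, len(tau) - len(rest) + 1))
    if head not in PRODUCTIONS:      # a terminal parses only itself
        return 1 if tau == head else 0
    # nonterminal rule: add up over all right-hand sides, with multiplicity
    return sum(k * parses(theta, tau)
               for theta, k in PRODUCTIONS[head].items())

def multiplicity(tau):
    """Multiplicity of tau in the multilanguage of the grammar."""
    return sum(k * parses(sigma, tau) for sigma, k in STARTS.items())

print([multiplicity("a" * n) for n in range(1, 6)])   # [1, 1, 2, 5, 14]
```

The counts are the Catalan numbers, one for each binary tree shape with the given number of leaves.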

Transformations. Programmers are especially interested in the way $L(\mathcal{G})$ changes when $\mathcal{G}$ is 
modified. For example, we often want to simplify grammars or put them into standard forms 
without changing the strings of $L(\mathcal{G})$ or their multiplicities. 

A nonterminal symbol $A$ is useless if it never occurs in any parses of strings in $L(\mathcal{G})$. This 
happens iff either $L(A) = \emptyset$ or there are no strings $\sigma \in S$, $\alpha \in V^*$, and $\omega \in V^*$ such that $\sigma \to^* \alpha A \omega$. 
We can remove all productions of $\mathcal{P}$ and all strings of $S$ that contain useless nonterminals, without 
changing $L(\mathcal{G})$. A grammar is said to be reduced if every element of $N$ is useful. 

Several basic transformations can be applied to any grammar without affecting the 
multilanguage $L(\mathcal{G})$. One of these transformations is called abbreviation: Let $X$ be a new symbol $\notin V$ 
and let $\theta$ be any string of $V^*$. Add $X$ to $N$ and add the production $X \to \theta$ to $\mathcal{P}$. Then we can 
replace $\theta$ by $X$ wherever $\theta$ occurs as a substring of a basic string, except in the production $X \to \theta$ 
itself, without changing $L(\mathcal{G})$; this follows from the fact that $L(X) = L(\theta)$. By repeated use of 
abbreviations we can obtain an equivalent grammar whose basic strings all have length 2 or less. 
The total length of all basic strings in the new grammar is less than twice the total length of all 
basic strings in the original. 

Another simple transformation, sort of an inverse to abbreviation, is called expansion. It 
replaces any basic string of the form $\alpha X \omega$ by the multiset of all strings $\alpha \theta \omega$ where $X \to \theta$. If $\alpha X \omega$ 
is the right-hand side of some production $A \to \alpha X \omega$, this means that the production is replaced 
in $\mathcal{P}$ by the multiset of productions $\{\, A \to \alpha \theta \omega \mid \theta \in \mathcal{P}(X) \,\}$; we are essentially replacing the 
element $\alpha X \omega$ of $\mathcal{P}(A)$ by the multiset $\{\, \alpha \theta \omega \mid \theta \in \mathcal{P}(X) \,\}$. Again, $L(\mathcal{G})$ is not affected. 

Expansion can cause some productions and/or starting strings to be repeated. If we had 
defined context-free grammars differently, taking $S$ and $\mathcal{P}$ to be sets instead of multisets, we would 
not be able to apply the expansion process in general without losing track of some parses. 

The third basic transformation, called elimination, deletes a given production $A \to \theta$ from $\mathcal{P}$ 
and replaces every remaining basic string $\sigma$ by $D(\sigma)$, where $D(\sigma)$ is a multiset defined recursively 
as follows: 

$$\begin{aligned}
D(A) &= \{A, \theta\}\,; \\
D(\sigma) &= \{\sigma\}\,, \qquad \text{if } \sigma \text{ does not include } A\,; \\
D(\sigma \sigma') &= D(\sigma)\, D(\sigma')\,.
\end{aligned}$$

If $\sigma$ has $n$ occurrences of $A$, these equations imply that $D(\sigma)$ has $2^n$ elements. Elimination preserves 
$L(\mathcal{G})$ because it simply removes all uses of the production $A \to \theta$ from parse trees. 
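
The recursive definition of the multiset used by elimination can be unrolled symbol by symbol. A minimal Python sketch, assuming each grammar symbol is a single character and writing the function as `D(sigma, A, theta)` with the eliminated production passed explicitly (that signature is an assumption made for this illustration):

```python
from collections import Counter

def D(sigma, A, theta):
    """Multiset of strings obtained from sigma by replacing any subset of
    the occurrences of the symbol A with theta."""
    results = Counter({"": 1})
    for symbol in sigma:
        choices = [A, theta] if symbol == A else [symbol]
        nxt = Counter()
        for prefix, k in results.items():
            for c in choices:
                nxt[prefix + c] += k
        results = nxt
    return results

# Eliminating A -> (empty string) from the basic strings "AbA" and "AA":
print(D("AbA", "A", ""))   # AbA, Ab, bA, b  (one copy each)
print(D("AA", "A", ""))    # AA, 2·A, and the empty string
```

The second call shows why multiplicities matter: the string A arises in two different ways, and both copies must be kept to preserve the number of parses.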

We can use elimination to make the grammar "$\epsilon$-free," i.e., to remove all productions whose 
right-hand side is empty. Complications arise, however, when a grammar is also "circular"; this 
means that it contains a nonterminal $A$ such that $A \to^+ A$. The grammars of most practical 
interest are non-circular, but we need to deal with circularity if we want to have a complete theory. 
It is easy to see that strings of infinite multiplicity occur in the multilanguage $L(\mathcal{G})$ of a reduced 
grammar $\mathcal{G}$ if and only if $\mathcal{G}$ is circular. 
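
Circular nonterminals can be found mechanically. The sketch below uses a standard approach that is not spelled out in the text: first compute the nonterminals that can produce the empty string, then link a nonterminal to every nonterminal that appears on one of its right-hand sides with all neighbouring symbols able to vanish, and finally take the transitive closure. Symbols are assumed to be single characters, and every nonterminal is assumed to appear as a key of the dictionary.

```python
def nullable_symbols(productions):
    """Nonterminals that can produce the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhss in productions.items():
            if lhs not in nullable and any(
                    all(s in nullable for s in rhs) for rhs in rhss):
                nullable.add(lhs)
                changed = True
    return nullable

def circular_nonterminals(productions):
    """Nonterminals that can produce themselves in one or more steps."""
    nullable = nullable_symbols(productions)
    # reach[A] starts as the nonterminals B with a production A -> x B y
    # in which every symbol of x and y can produce the empty string.
    reach = {a: set() for a in productions}
    for lhs, rhss in productions.items():
        for rhs in rhss:
            for i, s in enumerate(rhs):
                others = rhs[:i] + rhs[i + 1:]
                if s in productions and all(t in nullable for t in others):
                    reach[lhs].add(s)
    changed = True
    while changed:                   # transitive closure
        changed = False
        for a in reach:
            new = set().union(*(reach[b] for b in reach[a])) - reach[a]
            if new:
                reach[a] |= new
                changed = True
    return {a for a in reach if a in reach[a]}

# A -> AAa | B | empty,  B -> CC,  C -> BB | empty
GRAMMAR = {"A": ["AAa", "B", ""], "B": ["CC"], "C": ["BB", ""]}
print(sorted(circular_nonterminals(GRAMMAR)))   # ['B', 'C']
```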

One way to deal with the problem of circularity is to modify the grammar so that all the 
circularity is localized. Let $N = N_c \cup N_n$, where the nonterminals of $N_c$ are circular and those 
of $N_n$ are not. We will construct a new grammar $\mathcal{G}' = (T, N', S' \uplus S'', \mathcal{P}')$ with $L(\mathcal{G}') = L(\mathcal{G})$, 
for which all strings of the multilanguage $L(S') = \biguplus \{\, L(\sigma) \mid \sigma \in S' \,\}$ have infinite multiplicity 
and all strings of $L(S'') = \biguplus \{\, L(\sigma) \mid \sigma \in S'' \,\}$ have finite multiplicity. The nonterminals of $\mathcal{G}'$ 
are $N' = N_c \cup N_n \cup N'_n \cup N''_n$, where $N'_n = \{\, A' \mid A \in N_n \,\}$ and $N''_n = \{\, A'' \mid A \in N_n \,\}$ are new 
nonterminal alphabets in one-to-one correspondence with $N_n$. The new grammar will be defined 
in such a way that $L(A) = L(A') \uplus L(A'')$, where $L(A')$ contains only strings of infinite multiplicity 
and $L(A'')$ contains only strings of finite multiplicity. For each $\sigma \in S$ we include the members of $\sigma'$ 
in $S'$ and $\sigma''$ in $S''$, where $\sigma'$ and $\sigma''$ are multisets of strings defined as follows: If $\sigma$ includes a 
nonterminal in $N_c$, then $\sigma' = \{\sigma\}$ and $\sigma'' = \emptyset$. Otherwise suppose $\sigma = \alpha_0 A_1 \alpha_1 \ldots A_n \alpha_n$, where 
each $\alpha_k \in T^*$ and each $A_k \in N_n$; then 

$$\begin{aligned}
\sigma' &= \{\, \alpha_0 A''_1 \alpha_1 \ldots A''_{k-1} \alpha_{k-1} A'_k \alpha_k A_{k+1} \ldots A_n \alpha_n \mid 1 \le k \le n \,\}\,, \\
\sigma'' &= \{\, \alpha_0 A''_1 \alpha_1 \ldots A''_n \alpha_n \,\}\,.
\end{aligned}$$

(Intuitively, the leftmost use of a circular nonterminal in a derivation from $\sigma'$ will occur in the 
descendants of $A'_k$. No circular nonterminals will appear in derivations from $\sigma''$.) The productions $\mathcal{P}'$ 
are obtained from $\mathcal{P}$ by letting 

$$\begin{aligned}
\mathcal{P}'(A') &= \biguplus \{\, \sigma' \mid \sigma \in \mathcal{P}(A) \,\}\,, \\
\mathcal{P}'(A'') &= \biguplus \{\, \sigma'' \mid \sigma \in \mathcal{P}(A) \,\}\,.
\end{aligned}$$

This completes the construction of $\mathcal{G}'$. 
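
The two multisets built from a basic string can be computed by a direct transcription of their definitions. In this Python sketch a string is a sequence of one-character symbols, the primed and double-primed copies of a nonterminal are represented by appending `'` or `''` to its name, and multisets of strings are represented as lists; all of these representation choices, and the function name, are assumptions made for illustration.

```python
def split_basic_string(sigma, circular, noncircular):
    """Return the pair (primed, double_primed) for the basic string sigma.

    Each result is a list (used as a multiset) of strings, and each string
    is a list of symbol names.
    """
    if any(s in circular for s in sigma):
        return [list(sigma)], []
    positions = [i for i, s in enumerate(sigma) if s in noncircular]
    primed = []
    for k in positions:
        t = []
        for i, s in enumerate(sigma):
            if s not in noncircular:
                t.append(s)            # terminal: unchanged
            elif i < k:
                t.append(s + "''")     # to the left of position k
            elif i == k:
                t.append(s + "'")      # the distinguished position
            else:
                t.append(s)            # to the right of position k
        primed.append(t)
    double_primed = [[s + "''" if s in noncircular else s for s in sigma]]
    return primed, double_primed

# With A noncircular and B, C circular:
print(split_basic_string("AAa", {"B", "C"}, {"A"}))
print(split_basic_string("B", {"B", "C"}, {"A"}))
print(split_basic_string("", {"B", "C"}, {"A"}))
```

For the right-hand side AAa this gives the two primed strings A'Aa and A''A'a and the single double-primed string A''A''a; for the empty right-hand side the primed multiset is empty and the double-primed multiset contains only the empty string.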

We can also add a new nonterminal symbol $Z$, and two new productions 

$$Z \to Z\,, \qquad Z \to \epsilon\,.$$

The resulting grammar $\mathcal{G}''$ with starting strings $ZS' \uplus S''$ again has $L(\mathcal{G}'') = L(\mathcal{G})$, but now all 
strings with infinite multiplicity are derived from $ZS'$. This implies that we can remove circularity 
from all nonterminals except $Z$, without changing any multiplicities; then $Z$ will be the only source 
of infinite multiplicity. 

The details are slightly tricky but not really complicated. Let us remove accumulated primes 
from our notation, and work with a grammar $\mathcal{G} = (T, N, S, \mathcal{P})$ having the properties just assumed 
for $\mathcal{G}''$. We want $\mathcal{G}$ to have only $Z$ as a circular nonterminal. The first step is to remove instances 
of co-circularity: If $\mathcal{G}$ contains two nonterminals $A$ and $B$ such that $A \to^+ B$ and $B \to^+ A$, we 
can replace all occurrences of $B$ by $A$ and delete $B$ from $N$. This leaves $L(\mathcal{G})$ unaffected, because 
every string of $L(\mathcal{G})$ that has at least one parse involving $B$ has infinitely many parses both before 
and after the change is made. Therefore we can assume that $\mathcal{G}$ is a grammar in which the relations 
$A \to^+ B$ and $B \to^+ A$ imply $A = B$. 

Now we can topologically sort the nonterminals into order $A_0, A_1, \ldots, A_m$ so that $A_i \to^+ A_j$ 
only if $i \le j$; let $A_0 = Z$ be the special, circular nonterminal introduced above. The grammar will 
be in Chomsky normal form if all productions except those for $Z$ have one of the two forms 

$$A \to BC \qquad \text{or} \qquad A \to a\,,$$

where $A, B, C \in N$ and $a \in T$. Assume that this condition holds for all productions whose left-hand 
side is $A_l$ for some $l$ strictly greater than a given index $k > 0$; we will show how to make it hold 
also for $l = k$, without changing $L(\mathcal{G})$. 

Abbreviations will reduce the right-hand side of any production to length 2 or less. Moreover, 
if $A_k \to v_1 v_2$ for $v_1 \in T$, we can introduce a new abbreviation $A_k \to X v_2$, $X \to v_1$; a similar 
abbreviation applies if $v_2 \in T$. Therefore systematic use of abbreviation will put all productions 
with $A_k$ on the left into Chomsky normal form, except those of the forms $A_k \to A_l$ or $A_k \to \epsilon$. 
By assumption, we can have $A_k \to A_l$ only if $l \ge k$. If $l > k$, the production $A_k \to A_l$ can be 
eliminated by expansion; it is replaced by $A_k \to \theta$ for all $\theta \in \mathcal{P}(A_l)$, and these productions all 
have the required form. If $l = k$, the production $A_k \to A_k$ is redundant and can be dropped; this 
does not affect $L(\mathcal{G})$, since every string whose derivation uses $A_k$ has infinite multiplicity because 
it is derived from $ZS'$. Finally, a production of the form $A_k \to \epsilon$ can be removed by elimination 
as explained above. This does not lengthen the right-hand side of any production. But it might 
add new productions of the form $A_k \to A_l$ (which are handled as before) or of the form $A_j \to \epsilon$. 
The latter can occur only if there was a production $A_j \to A_k^n$ for some $n \ge 1$; hence $A_j \to^+ A_k$ 
and we must have $j \le k$. If $j = k$, the new production $A_k \to \epsilon$ can simply be dropped, because its 
presence merely gives additional parses to strings whose multiplicity is already infinite. 

This construction puts $\mathcal{G}$ into Chomsky normal form, except for the special productions $Z \to Z$ 
and $Z \to \epsilon$, without changing the multilanguage $L(\mathcal{G})$. If we want to proceed further, we could 
delete the production $Z \to Z$; this gives a grammar $\mathcal{G}'$ with $L(\mathcal{G}') \asymp L(\mathcal{G})$ and no circularity. And we 
can then eliminate $Z \to \epsilon$, obtaining a grammar $\mathcal{G}''$ in Chomsky normal form with $L(\mathcal{G}'') = L(\mathcal{G}')$. 
If $\mathcal{G}$ itself was originally noncircular, the special nonterminal $Z$ was always useless so it need not 
have been introduced; our construction produces Chomsky normal form directly in such cases. 

The construction in the preceding paragraphs can be illustrated by the following example 
grammar with terminal alphabet $\{a\}$, nonterminal alphabet $\{A, B, C\}$, starting set $\{A\}$, and 
productions 

$$A \to AAa\,, \quad A \to B\,, \quad A \to \epsilon\,, \quad B \to CC\,, \quad C \to BB\,, \quad C \to \epsilon\,.$$

The nonterminals are $N_n = \{A\}$ and $N_c = \{B, C\}$; so we add nonterminals $N'_n = \{A'\}$ and 
$N''_n = \{A''\}$, change the starting strings to 

$$S' = \{A'\}\,, \qquad S'' = \{A''\}\,,$$

and add the productions 

$$\begin{aligned}
&A' \to A'Aa\,, \quad A' \to A''A'a\,, \quad A' \to B\,; \\
&A'' \to A''A''a\,, \quad A'' \to \epsilon\,.
\end{aligned}$$

Now we introduce $Z$, replace $C$ by $B$, and make the abbreviations $X \to AY$, $X' \to A'Y$, $X'' \to A''Y$, 
$Y \to a$. The current grammar has terminal alphabet $\{a\}$, nonterminal alphabet $\{Z, A, A', A'', B,$ 
$X, X', X'', Y\}$ in topological order, starting strings $\{ZA', A''\}$, and productions 

$$\begin{aligned}
A' &\to \{A'X,\ A''X',\ B\}\,, \\
A'' &\to \{A''X'',\ \epsilon\}\,, \\
B &\to \{BB,\ BB,\ \epsilon\}\,,
\end{aligned}$$

plus those for $X$, $X'$, $X''$, $Y$ already stated. Eliminating the production $B \to \epsilon$ yields new 
productions $A \to \epsilon$, $A' \to \epsilon$; eliminating $A'' \to \epsilon$ yields a new starting string $\epsilon$ and new productions 
$A' \to X'$, $A'' \to X''$, $X'' \to a$. We eventually reach a near-Chomsky-normal grammar with starting 
strings $\{Z, ZA', ZA'', A'', \epsilon\}$ and productions 

$$\begin{aligned}
Z &\to \{Z,\ \epsilon\}\,, \\
A &\to \{AX,\ AY,\ AY,\ BB,\ BB,\ a,\ a,\ a,\ a\}\,, \\
A' &\to \{AY,\ A'X,\ A'Y,\ A''X',\ BB,\ BB,\ a,\ a,\ a\}\,, \\
A'' &\to \{A''X'',\ A''Y,\ a\}\,, \\
B &\to \{BB,\ BB\}\,, \\
X &\to \{AY,\ a,\ a\}\,, \\
X' &\to \{A'Y,\ a\}\,, \\
X'' &\to \{A''Y,\ a\}\,, \\
Y &\to \{a\}\,.
\end{aligned}$$

Once a grammar is in Chomsky normal form, we can go further and eliminate left-recursion. 
A nonterminal symbol $X$ is called left-recursive if $X \to^+ X\omega$ for some $\omega \in V^*$. The following 
transformation makes $X$ non-left-recursive without introducing any additional left-recursive 
nonterminals: Introduce new nonterminals $N' = \{\, A' \mid A \in N \,\}$, and new productions 

$$\begin{aligned}
&\{\, B' \to CA' \mid A \to BC \in \mathcal{P} \,\}\,, \\
&\{\, X \to aA' \mid A \to a \in \mathcal{P} \,\}\,, \\
&X' \to \epsilon\,,
\end{aligned}$$

and delete all the original productions of $\mathcal{P}(X)$. It is not difficult to prove that $L(\mathcal{G}') = L(\mathcal{G})$ for 
the new grammar $\mathcal{G}'$, because there is a one-to-one correspondence between parse trees for the two 
grammars. The basic idea is to consider all "maximal left paths" of nodes labelled $A_1, \ldots, A_r$, 
corresponding to the productions 

$$A_1 \to A_2 B_1 \to A_3 B_2 B_1 \to \cdots \to A_r B_{r-1} B_{r-2} \ldots B_1 \to a B_{r-1} B_{r-2} \ldots B_1$$

in $\mathcal{G}$, where $A_1$ labels either the root or the right subtree of $A_1$'s parent in a parse for $\mathcal{G}$. If $X$ 
occurs as at least one of the nonterminals $\{A_1, \ldots, A_r\}$, say $A_j = X$ but $A_i \ne X$ for $i < j$, the 
corresponding productions of $\mathcal{G}'$ change the left path into a right path after branch $j$: 

$$\begin{aligned}
A_1 \to \cdots \to A_j B_{j-1} \ldots B_1 &\to a A'_r B_{j-1} \ldots B_1 \to a B_{r-1} A'_{r-1} B_{j-1} \ldots B_1 \\
&\to \cdots \to a B_{r-1} \ldots B_j A'_j B_{j-1} \ldots B_1 \\
&\to a B_{r-1} \ldots B_j B_{j-1} \ldots B_1\,.
\end{aligned}$$

The subtrees for $B_1, \ldots, B_{r-1}$ undergo the same reversible transformation. 

Once left recursion is removed, it is a simple matter to put the grammar into Greibach normal 
form [3], in which all productions can be written 

$$A \to a A_1 \ldots A_k\,, \qquad k \ge 0\,,$$

for $a \in T$ and $A, A_1, \ldots, A_k \in N$. First we order the nonterminals $X_1, \ldots, X_n$ so that $X_i \to X_j X_k$ 
only when $i < j$; then we expand all such productions, for decreasing values of $i$. 

Transduction. A general class of transformations that change one context-free language into 
another was discovered by Ginsburg and Rose [2], and the same ideas carry over to multilanguages. 
My notes from 1964 use the word "juxtamorphism" for a slightly more general class of mappings; 
I don't remember whether I coined that term at the time or found it in the literature. At any rate, 
I'll try it here again and see if it proves to be acceptable. 

If $F$ is a mapping from strings over $T$ to multilanguages over $T'$, it is often convenient to write 
$\alpha^F$ instead of $F(\alpha)$ for the image of $\alpha$ under $F$. A family of such mappings $F_1, \ldots, F_r$ is said to 
define a juxtamorphism if, for all $j$ and for all nonempty strings $\alpha$ and $\beta$, the multilanguage $(\alpha\beta)^{F_j}$ 
can be expressed as a finite multiset union of multilanguages having "bilinear form" 

$$\alpha^{F_k} \beta^{F_l} \qquad \text{or} \qquad \beta^{F_k} \alpha^{F_l}\,.$$

The juxtamorphism family is called context-free if $a^{F_j}$ and $\epsilon^{F_j}$ are context-free multilanguages for 
all $a \in T$ and all $j$. 

For example, many mappings satisfy this condition with $r = 1$. The reflection mapping, 
which takes every string $\alpha = a_1 \ldots a_m$ into $\alpha^R = a_m \ldots a_1$, obviously satisfies $(\alpha\beta)^R = \beta^R \alpha^R$. 
The composition mapping, which takes $\alpha = a_1 \ldots a_m$ into $\alpha^L = L(a_1) \ldots L(a_m)$ for any given 
multilanguages $L(a)$ defined for each $a \in T$, satisfies $(\alpha\beta)^L = \alpha^L \beta^L$. 

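The prefix mapping, which takes $\alpha = a_1 \ldots a_m$ into $\alpha^P = \{\epsilon, a_1, a_1 a_2, \ldots, a_1 \ldots a_m\}$, is a 
member of a juxtamorphism family with $r = 3$: It satisfies 

$$\begin{aligned}
(\alpha\beta)^P &= \alpha^P \beta^E \uplus \alpha^I \beta^P\,, \\
(\alpha\beta)^I &= \alpha^I \beta^I\,, \\
(\alpha\beta)^E &= \alpha^E \beta^E\,,
\end{aligned}$$

where $I$ is the identity and $\alpha^E = \epsilon$ for all $\alpha$. 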

Any finite-state transduction, which maps $\alpha = a_1 \ldots a_m$ into 

$$\alpha^T = \{\, f(q_0, a_1)\, f(q_1, a_2) \ldots f(q_{m-1}, a_m)\, f(q_m, \epsilon) \mid q_j \in g(q_{j-1}, a_j) \,\}\,,$$

is a special case of a juxtamorphism. Here $q_0, \ldots, q_m$ are members of a finite set of states $Q$, and 
$g$ is a next-state function from $Q \times T$ into subsets of $Q$; the mapping $f$ takes each member of 
$Q \times (T \cup \{\epsilon\})$ into a context-free multilanguage. The juxtamorphism can be defined as follows: 
Given $q, q' \in Q$, let $\alpha^{qq'}$ be $\{\, f(q_0, a_1) \ldots f(q_{m-1}, a_m) \mid q_0 = q$ and $q_j \in g(q_{j-1}, a_j)$ and $q_m = q' \,\}$. 
Also let $\alpha^q$ be $\alpha^T$ as described above, when $q_0 = q$. Then 

$$\begin{aligned}
(\alpha\beta)^{qq'} &= \biguplus_{q'' \in Q} \alpha^{qq''} \beta^{q''q'}\,; \\
(\alpha\beta)^{q} &= \biguplus_{q' \in Q} \alpha^{qq'} \beta^{q'}\,.
\end{aligned}$$

The following extension of the construction by Ginsburg and Rose yields a context-free 
grammar $\mathcal{G}_j$ for $L(\mathcal{G})^{F_j}$, given any juxtamorphism family $F_1, \ldots, F_r$. The grammar $\mathcal{G}$ can be assumed in 
Chomsky normal form, except for a special nonterminal $Z$ as mentioned above. The given 
context-free multilanguages $a^{F_j}$ and $\epsilon^{F_j}$ have terminal alphabet $T'$, disjoint nonterminal alphabets $N^{(a,F_j)}$ 
and $N^{(\epsilon,F_j)}$, starting strings $S^{(a,F_j)}$ and $S^{(\epsilon,F_j)}$, productions $\mathcal{P}^{(a,F_j)}$ and $\mathcal{P}^{(\epsilon,F_j)}$. Each grammar $\mathcal{G}_j$ 
has all these plus nonterminal symbols $A^{F_j}$ for all $j$ and for all nonterminals $A$ in $\mathcal{G}$. Each production 
$A \to a$ in $\mathcal{G}$ leads to productions $A^{F_j} \to \{\, \sigma \mid \sigma \in S^{(a,F_j)} \,\}$ for all $j$. Each production $A \to BC$ in $\mathcal{G}$ 
leads to the productions for each $A^{F_j}$ based on its juxtamorphism representation. For example, in 
the case of prefix mapping above we would have the productions 

$$A^P \to B^P C^E\,, \quad A^P \to B^I C^P\,, \quad A^I \to B^I C^I\,, \quad A^E \to B^E C^E\,.$$

The starting strings for $\mathcal{G}_j$ are obtained from those of $\mathcal{G}$ in a similar way. Further details are left 
to the reader. 

In particular, one special case of finite-state transduction maps $\alpha$ into $\{k \cdot \alpha\}$ if $\alpha$ is accepted 
in exactly $k$ ways by a finite-state automaton. (Let $f(q, a) = a$, and let $f(q, \epsilon) = \{\epsilon\}$ or $\emptyset$ according 
as $q$ is an accepting state or not.) The construction above shows that if $L_1$ is a context-free 
multilanguage and $L_2$ is a regular multilanguage, the multilanguage $L_1 \Cap L_2$ is context-free. 
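
The number of ways an automaton accepts a string can be counted by carrying path counts from state to state. A minimal Python sketch follows; the automaton, the dictionary encoding of the next-state function, and the function names are all hypothetical, and the input multilanguage is a finite `Counter` rather than a grammar, so this illustrates only the effect on multiplicities, not the grammar construction itself.

```python
from collections import Counter

def accepting_paths(word, g, start, accepting):
    """Number of state sequences that begin at `start`, follow the
    next-state function g on the letters of `word`, and end in an
    accepting state."""
    ways = Counter({start: 1})
    for a in word:
        nxt = Counter()
        for q, k in ways.items():
            for q2 in g.get((q, a), ()):
                nxt[q2] += k
        ways = nxt
    return sum(k for q, k in ways.items() if q in accepting)

def restrict(language, g, start, accepting):
    """Multiply each multiplicity by the number of accepting paths."""
    out = Counter()
    for w, k in language.items():
        n = k * accepting_paths(w, g, start, accepting)
        if n:
            out[w] = n
    return out

# Hypothetical automaton over {a}: state 0 may stay or move to state 1;
# state 1 stays. State 1 is accepting, so a string of n a's is accepted
# in n ways.
G = {(0, "a"): {0, 1}, (1, "a"): {1}}
L1 = Counter({"a": 1, "aa": 1, "aaa": 2, "b": 5})

print(accepting_paths("aaa", G, 0, {1}))   # 3
print(restrict(L1, G, 0, {1}))
```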

Quantitative considerations. Since multisets carry more information than the underlying sets, 
we can expect that more computation will be needed in order to keep track of everything. From 
a worst-case standpoint, this is bad news. For example, consider the comparatively innocuous 
productions 

$$\begin{gathered}
A_0 \to \epsilon\,, \quad A_0 \to \epsilon\,, \\
A_1 \to A_0 A_0\,, \quad A_2 \to A_1 A_1\,, \quad \ldots\,, \quad A_n \to A_{n-1} A_{n-1}\,,
\end{gathered}$$

with starting string $\{A_n\}$. This grammar is almost in Chomsky normal form, except for the 
elimination of $\epsilon$. But $\epsilon$-removal is rather horrible: There are $2^{2^n}$ ways to derive $\epsilon$ from $A_n$. Hence 
we will have to replace the multiset of starting strings by $\{2^{2^n} \cdot \epsilon\}$. 

Let us add further productions $A_k \to a_k$ to the grammar above, for $0 \le k \le n$, and then reduce 
to Chomsky normal form by "simply" removing the two productions $A_0 \to \epsilon$. The normal-form 
productions will be 

$$A_k \to \{\, 2^{2^k - 2^j + k - j} \cdot A_{j-1} A_{j-1} \mid 1 \le j \le k \,\} \uplus \{\, 2^{2^k - 2^j + k - j} \cdot a_j \mid 0 \le j \le k \,\}\,.$$

Evidently if we wish to implement the algorithms for normal forms, we should represent multisets 
of strings by counting multiplicities in binary rather than unary; even so, the results might blow 
up exponentially. 
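
The doubly exponential multiplicities can be checked numerically by counting parses in the grammar before any empty productions are removed. The Python sketch below relies on one assumption beyond the text: that the multiplicity of a one-letter production in the normal form equals the number of parses of that one-letter string in the original grammar (which is what preservation of multiplicities requires, since a two-nonterminal production cannot yield a single letter once empty productions are gone). The function names are illustrative.

```python
def epsilon_parses(k):
    """Number of parses of the empty string from the k-th nonterminal:
    2 for k = 0, and the square of the previous value after that."""
    return 2 if k == 0 else epsilon_parses(k - 1) ** 2

def letter_parses(k, j):
    """Number of parses of the one-letter string a_j from the k-th
    nonterminal in the original grammar, for 0 <= j <= k."""
    if j == k:
        return 1    # only the direct production yields a_k
    # Otherwise use the two-child production: one child yields a_j and
    # the other yields the empty string, in either order.
    return 2 * epsilon_parses(k - 1) * letter_parses(k - 1, j)

def claimed(k, j):
    """Multiplicity stated in the normal-form productions."""
    return 2 ** (2 ** k - 2 ** j + k - j)

print(epsilon_parses(3))                    # 256
print(letter_parses(3, 1), claimed(3, 1))   # 256 256
print(all(letter_parses(k, j) == claimed(k, j)
          for k in range(7) for j in range(k + 1)))
```

Python integers have arbitrary precision, which matches the remark that multiplicities should be stored in binary rather than unary.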

Fortunately this is not a serious problem in practice, since most artificial languages have 
unambiguous or nearly unambiguous grammars; multiplicities of reasonable grammars tend to be 
low. And we can at least prove that the general situation cannot get much worse than the behavior 
of the example above: Consider a noncircular grammar with $n$ nonterminals and with $m$ productions 
having one of the four forms $A \to BC$, $A \to B$, $A \to a$, $A \to \epsilon$. Then the process of conversion to 
Chomsky normal form does not increase the set of distinct right-hand sides $\{BC\}$ or $\{a\}$; hence 
the total number of distinct productions will be at most $O(mn)$. The multiplicities of productions 
will be bounded by the number of ways to attach labels $\{1, \ldots, m\}$ to the nodes of the complete 
binary tree with $2^{n-1}$ leaves, namely $m^{2^n - 1}$. 

Conclusions. String coefficients that correspond to the exact number of parses are important 
in applications of context-free grammars, so it is desirable to keep track of such multiplicities as 
the theory is developed. This is nothing new when context-free multilanguages are considered 
as algebraic power series in noncommuting variables, except in cases where the coefficients are 
infinite. But the intuition that comes from manipulations on trees, grammars, and automata nicely 
complements the purely algebraic approaches to this theory. It's a beautiful theory that deserves 
to be remembered by computer scientists of the future, even though it is no longer a principal focus 
of contemporary research. 

Let me close by stating a small puzzle. Context-free multilanguages are obviously closed 
under $\uplus$. But they are not closed under $\cup$, because for example the language 

$$\{\, a^i b^j c^i d^k \mid i, j, k \ge 1 \,\} \cup \{\, a^i b^j c^k d^j \mid i, j, k \ge 1 \,\}$$

is inherently ambiguous [9]. Is it true that $L_1 \cup L_2$ is a context-free multilanguage whenever $L_1$ is 
context-free and $L_2$ is regular? 